 Olathe


HySim-LLM: Embedding-Weighted Fine-Tuning Bounds and Manifold Denoising for Domain-Adapted LLMs

Jaberi-Douraki, Majid, Sholehrasa, Hossein, Xu, Xuan, Ramachandran, Remya Ampadi

arXiv.org Artificial Intelligence

The extraction and standardization of pharmacokinetic (PK) information from scientific literature remain significant challenges in computational pharmacology, which limits the reliability of data-driven models in drug development. Large language models (LLMs) have achieved remarkable progress in text understanding and reasoning, yet their adaptation to structured biomedical data, such as PK tables, remains constrained by heterogeneity, noise, and domain shift. To address these limitations, we propose HySim-LLM, a unified mathematical and computational framework that integrates embedding-weighted fine-tuning and manifold-aware denoising to enhance the robustness and interpretability of LLMs. We establish two theoretical results: (1) a similarity-weighted generalization bound that quantifies adaptation performance under embedding divergence, and (2) a manifold-based denoising guarantee that bounds loss contributions from noisy or off-manifold samples. These theorems provide a principled foundation for fine-tuning LLMs in structured biomedical settings. The framework offers a mathematically grounded pathway toward reliable and interpretable LLM adaptation for biomedical and data-intensive scientific domains.


Predictive Modeling and Explainable AI for Veterinary Safety Profiles, Residue Assessment, and Health Outcomes Using Real-World Data and Physicochemical Properties

Sholehrasa, Hossein, Xu, Xuan, Caragea, Doina, Riviere, Jim E., Jaberi-Douraki, Majid

arXiv.org Artificial Intelligence

The safe use of pharmaceuticals in food-producing animals is vital to protect animal welfare and human food safety. Adverse events (AEs) may signal unexpected pharmacokinetic or toxicokinetic effects, increasing the risk of violative residues in the food chain. This study introduces a predictive framework for classifying outcomes (Death vs. Recovery) using ~1.28 million reports (1987-2025 Q1) from the U.S. FDA's OpenFDA Center for Veterinary Medicine. A preprocessing pipeline merged relational tables and standardized AEs through VeDDRA ontologies. Data were normalized, missing values imputed, and high-cardinality features reduced; physicochemical drug properties were integrated to capture chemical-residue links. We evaluated supervised models, including Random Forest, CatBoost, XGBoost, ExcelFormer, and large language models (Gemma 3-27B, Phi 3-12B). Class imbalance was addressed with techniques such as undersampling and oversampling, with a focus on prioritizing recall for fatal outcomes. Ensemble methods (Voting, Stacking) and CatBoost performed best, achieving precision, recall, and F1-scores of 0.95. Incorporating Average Uncertainty Margin (AUM)-based pseudo-labeling of uncertain cases improved minority-class detection, particularly in ExcelFormer and XGBoost. Interpretability analysis via SHAP identified biologically plausible predictors, including lung, heart, and bronchial disorders, animal demographics, and drug physicochemical properties. These features were strongly linked to fatal outcomes. Overall, the framework shows that combining rigorous data engineering, advanced machine learning, and explainable AI enables accurate, interpretable predictions of veterinary safety outcomes. The approach supports FARAD's mission by enabling early detection of high-risk drug-event profiles, strengthening residue risk assessment, and informing regulatory and clinical decision-making.
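
The abstract above mentions oversampling as one way to handle class imbalance. The following is a minimal sketch of random oversampling in plain Python, not the authors' pipeline: the function name `random_oversample`, the toy rows, and the seed are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equally frequent.

    Hypothetical helper for illustration; the study's actual resampling
    procedure is not described in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # All original rows that carry this label.
        pool = [r for r, l in zip(rows, labels) if l == label]
        # Draw with replacement until this class reaches the target size.
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (made up): four "Recovery" reports and one "Death" report.
toy_rows = [{"id": i} for i in range(5)]
toy_labels = ["Recovery"] * 4 + ["Death"]
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only; duplicating rows before the train/test split would leak copies of the same report into evaluation.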


AutoPK: Leveraging LLMs and a Hybrid Similarity Metric for Advanced Retrieval of Pharmacokinetic Data from Complex Tables and Documents

Sholehrasa, Hossein, Ghanaatian, Amirhossein, Caragea, Doina, Tell, Lisa A., Riviere, Jim E., Jaberi-Douraki, Majid

arXiv.org Artificial Intelligence

Pharmacokinetics (PK) plays a critical role in drug development and regulatory decision-making for human and veterinary medicine, directly affecting public health through drug safety and efficacy assessments. However, PK data are often embedded in complex, heterogeneous tables with variable structures and inconsistent terminologies, posing significant challenges for automated PK data retrieval and standardization. In the first stage, AutoPK identifies and extracts PK parameter variants using large language models (LLMs), a hybrid similarity metric, and LLM-based validation. The second stage filters relevant rows, converts the table into a key-value text format, and uses an LLM to reconstruct a standardized, machine-readable table. Evaluated on a real-world dataset of 605 annotated PK tables, including captions and footnotes, AutoPK demonstrates significant improvements in precision and recall over direct LLM baselines. For instance, AutoPK with LLaMA 3.1-70B achieved an F1-score of 0.92 on half-life and 0.91 on clearance parameters, outperforming direct use of LLaMA 3.1-70B by margins of 0.10 and 0.21, respectively. Smaller models such as Gemma 3-27B and Phi 3-12B with AutoPK achieved 2-7 fold F1 gains over their direct use, with Gemma's hallucination rates reduced from 60-95% down to 8-14%. Notably, AutoPK enabled open-source models like Gemma 3-27B to outperform commercial systems such as GPT-4o Mini on several PK parameters. AutoPK enables scalable and high-confidence PK data extraction, making it well-suited for critical applications in veterinary pharmacology, drug safety monitoring, and public health decision-making, while addressing heterogeneous table structures and terminology and demonstrating generalizability across key PK parameters.
This is the author's version of the work accepted for publication in: Proceedings of the 2025 IEEE 37th International Conference on Tools with Artificial Intelligence (ICTAI). The final published version will be available via IEEE Xplore.
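
The AutoPK abstract refers to a hybrid similarity metric for matching PK parameter variants but does not define it. As a rough illustration of the general idea only, the sketch below blends a token-overlap score with a character-level score. The two components, the equal weighting, and the names `token_jaccard`, `char_ratio`, and `hybrid_similarity` are all assumptions, not the metric used in the paper.

```python
import re
from difflib import SequenceMatcher


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the sets of lowercase alphanumeric tokens."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return len(ta & tb) / len(ta | tb)


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib's matching blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted blend of token-level and character-level similarity.

    `weight` is a hypothetical tuning parameter (share given to token overlap).
    """
    return weight * token_jaccard(a, b) + (1.0 - weight) * char_ratio(a, b)


# Compare a canonical parameter name against a spelling variant and an unrelated term.
for candidate in ["half life", "Half-Life", "clearance"]:
    print(candidate, round(hybrid_similarity("half-life", candidate), 3))
```

Blending the two signals lets spelling variants such as "half life" score close to the canonical "half-life" while unrelated parameter names stay low; in practice an embedding-based component could replace either part.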


Detecting Daily Living Gait Amid Huntington's Disease Chorea using a Foundation Deep Learning Model

Schwartz, Dafna, Quinn, Lori, Fritz, Nora E., Muratori, Lisa M., Hausdorff, Jeffery M., Bachrach, Ran Gilad

arXiv.org Artificial Intelligence

Wearable sensors offer a non-invasive way to collect physical activity (PA) data, with walking as a key component. Existing models often struggle to detect gait bouts in individuals with neurodegenerative diseases (NDDs) involving involuntary movements. We developed J-Net, a deep learning model inspired by U-Net, which uses a pre-trained self-supervised foundation model fine-tuned with Huntington's disease (HD) in-lab data and paired with a segmentation head for gait detection. J-Net processes wrist-worn accelerometer data to detect gait during daily living. We evaluated J-Net on in-lab and daily-living data from HD, Parkinson's disease (PD), and controls. J-Net achieved a 10-percentage-point improvement in ROC-AUC for HD over existing methods, reaching 0.97 for in-lab data. In daily-living environments, J-Net estimates showed no significant differences in median daily walking time between HD and controls (p = 0.23), in contrast to other models, which indicated counterintuitive results (p < 0.005). Walking time measured by J-Net correlated with the UHDRS-TMS clinical severity score (r = -0.52; p = 0.02), confirming its clinical relevance. Fine-tuning J-Net on PD data also improved gait detection over current methods. J-Net's architecture effectively addresses the challenges of gait detection in severe chorea and offers robust performance in daily living. The dataset and J-Net model are publicly available, providing a resource for further research into NDD-related gait impairments.


Decoding Fatigue Levels of Pilots Using EEG Signals with Hybrid Deep Neural Networks

Lee, Dae-Hyeok, Kim, Sung-Jin, Kim, Si-Hyun

arXiv.org Artificial Intelligence

The detection of pilots' mental states is critical, as abnormal mental states have the potential to cause catastrophic accidents. This study demonstrates the feasibility of using deep learning techniques to classify different fatigue levels, specifically a normal state, low fatigue, and high fatigue. To the best of our knowledge, this is the first study to classify fatigue levels in pilots. Our approach employs a hybrid deep neural network comprising five convolutional blocks and one long short-term memory block to extract significant features from electroencephalography signals. Ten pilots participated in the experiment, which was conducted in a simulated flight environment. Compared to four conventional models, our proposed model achieved a superior grand-average accuracy of 0.8801, outperforming other models by at least 0.0599 in classifying fatigue levels. In addition to successfully classifying fatigue levels, our model provided valuable feedback to subjects. Therefore, we anticipate that our study will make significant contributions to the advancement of autonomous flight and driving technologies, leveraging artificial intelligence in the future.


Decoding EEG-based Workload Levels Using Spatio-temporal Features Under Flight Environment

Lee, Dae-Hyeok, Kim, Sung-Jin, Kim, Si-Hyun, Lee, Seong-Whan

arXiv.org Artificial Intelligence

The detection of pilots' mental states is important because abnormal mental states can result in catastrophic accidents. This study demonstrates the feasibility of employing deep learning techniques to classify different workload levels, specifically normal state, low workload, and high workload. To the best of our knowledge, this study is the first attempt to classify workload levels of pilots. Our approach uses a hybrid deep neural network that consists of five convolutional blocks and one long short-term memory block to extract significant features from electroencephalography signals. Ten pilots participated in the experiment, which was conducted in a simulated flight environment. Compared to four conventional models, our proposed model achieved a superior grand-average accuracy of 0.8613, surpassing the conventional models by at least 0.0597 in classifying workload levels across all participants. Our model not only successfully classified workload levels but also provided valuable feedback to the participants. Hence, we anticipate that our study will make significant contributions to the advancement of autonomous flight and driving leveraging artificial intelligence technology in the future.


Classification of Distraction Levels Using Hybrid Deep Neural Networks From EEG Signals

Lee, Dae-Hyeok, Kim, Sung-Jin, Choi, Yeon-Woo

arXiv.org Artificial Intelligence

Non-invasive brain-computer interface technology has been developed for detecting human mental states with high performance. Detection of pilots' mental states is particularly critical because abnormal mental states could cause catastrophic accidents. In this study, we demonstrate the feasibility of classifying distraction levels (namely, normal state, low distraction, and high distraction) by applying a deep learning method. To the best of our knowledge, this study is the first attempt to classify distraction levels under a flight environment. We propose a model for classifying distraction levels. A total of ten pilots participated in the experiment in a simulated flight environment. The grand-average accuracy was 0.8437 for classifying distraction levels across all subjects. Hence, we believe that this work will contribute significantly to autonomous driving or flight based on artificial intelligence technology in the future.


Satellite images and machine learning can identify remote communities to facilitate access to health services

#artificialintelligence

Community health systems operating in remote areas require accurate information about where people live to efficiently provide services across large regions. We sought to determine whether machine learning analysis of satellite imagery can be used to map remote communities to facilitate service delivery and planning. We developed a method for mapping communities using a deep learning approach that excels at detecting objects within images. We trained an algorithm to detect individual buildings, then examined building clusters to identify groupings suggestive of communities. The approach was validated in southeastern Liberia by comparing algorithmically generated results with community location data collected manually by enumerators and community health workers. The deep learning approach achieved 86.47% positive predictive value and 79.49% sensitivity with respect to individual building detection. The approach identified 75.67% (n = 451) of communities registered through the community enumeration process, and identified an additional 167 potential communities not previously registered. Several instances of false positives and false negatives were identified.
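
For readers less familiar with the two detection metrics quoted above, the sketch below shows how positive predictive value and sensitivity are computed from detection counts. The function name `detection_metrics` and the counts in the usage example are made up for illustration; they are not the study's data.

```python
def detection_metrics(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (positive predictive value, sensitivity) for a detector.

    PPV         = TP / (TP + FP): share of detections that are real buildings.
    Sensitivity = TP / (TP + FN): share of real buildings that were detected.
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    if detected == 0 or actual == 0:
        raise ValueError("metrics are undefined when there are no detections or no real objects")
    return true_pos / detected, true_pos / actual


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
ppv, sensitivity = detection_metrics(true_pos=80, false_pos=20, false_neg=20)
print(f"PPV={ppv:.2f}, sensitivity={sensitivity:.2f}")
```

A high PPV with lower sensitivity, as reported in the abstract, means that most detected buildings are real while a larger share of real buildings goes undetected.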


5 stories from last week that deserve a second look

PBS NewsHour

The word "Disagree" is seen on the hand of Julia Grabowski during a town hall meeting for Republican U.S. Senator Bill Cassidy in Metairie, Louisiana. News about President Donald Trump -- including an apparently neglected vegetable garden that once belonged to former first lady Michelle Obama -- is inescapable. As The New York Times' Farhad Manjoo wrote, "he is no longer just the message. In many cases, he has become the medium." Mental health professionals in the U.S. have reported that the all-encompassing coverage of the president has induced anxiety and depression, or post-election stress, in many of their patients.